{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using Ragas Critic Model instead of GPT-4\n",
    "\n",
    "Synthetic test data generation using LLMs for two purposes:\n",
    "1. Generation of QA pairs, evolution, etc\n",
    "2. LLM as Critic model to give feedback to generated QA pairs to ensure and improve quality\n",
    "\n",
    "We have built and opensourced a [custom model](https://huggingface.co/explodinggradients/Ragas-critic-llm-Qwen1.5-GPTQ) as critic model to be used instead of GPT-4 (default). This model is avaialble here for free and can deliver upto 200 tokens per second of an A10 instance. \n",
    "\n",
    "Follow the rest of the notebook to use this model as critic model instead of GPT-4.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Importing required modules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/shahul/.conda/envs/ragas/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    }
   ],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "import os\n",
    "\n",
    "\n",
    "from ragas.testset.prompts import (\n",
    "    context_scoring_prompt,\n",
    "    evolution_elimination_prompt,\n",
    "    filter_question_prompt,\n",
    ")\n",
    "from langchain_community.document_loaders import DirectoryLoader\n",
    "from ragas.testset.generator import TestsetGenerator\n",
    "from ragas.testset.evolutions import simple, reasoning, multi_context"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setting up generator Model (gpt-3.5)\n",
    "Any model to be used as generator - here gpt 3.5 or use any models by checking [docs](https://docs.ragas.io/en/stable/howtos/customisations/bring-your-own-llm-or-embs.html)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"OPENAI_API_KEY\"] = \"key\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setting up Model\n",
    "Ragas critic can generate upto 200 tokens/sec on a single A10 instance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Host the model using VLLM**\n",
    "\n",
    "Run this on your terminal with GPU enabled\n",
    "```\n",
    "python -m vllm.entrypoints.openai.api_server --model explodinggradients/Ragas-critic-llm-Qwen1.5-GPTQ\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "inference_server_url = \"http://localhost:8000/v1\"\n",
    "MODEL = \"explodinggradients/Ragas-critic-llm-Qwen1.5-GPTQ\"\n",
    "chat = ChatOpenAI(\n",
    "    model=MODEL,\n",
    "    openai_api_key=\"token-abc123\",\n",
    "    openai_api_base=inference_server_url,\n",
    "    max_tokens=2048,\n",
    "    temperature=0,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up custom Critic Model instead of GPT-4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# remove demonstrations from examples\n",
    "for prompt in [\n",
    "    context_scoring_prompt,\n",
    "    evolution_elimination_prompt,\n",
    "    filter_question_prompt,\n",
    "]:\n",
    "    prompt.examples = []"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ragas.testset.filters import QuestionFilter, EvolutionFilter, NodeFilter\n",
    "\n",
    "\n",
    "from ragas.llms import LangchainLLMWrapper\n",
    "\n",
    "langchain_llm = LangchainLLMWrapper(chat)\n",
    "\n",
    "qa_filter = QuestionFilter(langchain_llm, filter_question_prompt)\n",
    "node_filter = NodeFilter(langchain_llm, context_scoring_prompt=context_scoring_prompt)\n",
    "evolution_filter = EvolutionFilter(langchain_llm, evolution_elimination_prompt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "distributions = {simple: 0.5, reasoning: 0.25, multi_context: 0.25}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# customise the filters\n",
    "from ragas.testset.evolutions import ComplexEvolution\n",
    "\n",
    "for evolution in distributions:\n",
    "    if evolution.question_filter is None:\n",
    "        evolution.question_filter = qa_filter\n",
    "    if evolution.node_filter is None:\n",
    "        evolution.node_filter = node_filter\n",
    "\n",
    "    if isinstance(evolution, ComplexEvolution):\n",
    "        if evolution.evolution_filter is None:\n",
    "            evolution.evolution_filter = evolution_filter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading data "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "fatal: destination path 'prompt-engineering-guide-papers' already exists and is not an empty directory.\n"
     ]
    }
   ],
   "source": [
    "! git clone https://huggingface.co/datasets/explodinggradients/prompt-engineering-guide-papers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = DirectoryLoader(\"./prompt-engineering-guide-papers/\", glob=\"*.pdf\")\n",
    "documents = loader.load()\n",
    "\n",
    "for document in documents:\n",
    "    document.metadata[\"filename\"] = document.metadata[\"source\"]\n",
    "\n",
    "documents = [doc for doc in documents if len(doc.page_content.split()) > 5000]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generating "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/tmp/ipykernel_10833/120543537.py:1: DeprecationWarning: The function with_openai was deprecated in 0.1.4, and will be removed in the 0.2.0 release. Use from_langchain instead.\n",
      "  generator = TestsetGenerator.with_openai(chunk_size=512)\n",
      "Generating:  90%|█████████ | 9/10 [00:12<00:01,  1.44s/it]        Failed to parse output. Returning None.\n",
      "Failed to parse output. Returning None.\n",
      "Generating: 100%|██████████| 10/10 [00:18<00:00,  1.88s/it]\n"
     ]
    }
   ],
   "source": [
    "generator = TestsetGenerator.with_openai(chunk_size=512)\n",
    "testset = generator.generate_with_langchain_docs(\n",
    "    documents[:10],\n",
    "    test_size=10,\n",
    "    raise_exceptions=False,\n",
    "    with_debugging_logs=False,\n",
    "    distributions=distributions,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>question</th>\n",
       "      <th>contexts</th>\n",
       "      <th>ground_truth</th>\n",
       "      <th>evolution_type</th>\n",
       "      <th>metadata</th>\n",
       "      <th>episode_done</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>What is GPT-Neo and its significance in the fi...</td>\n",
       "      <td>[ in robotic affordances, 2022. URL https://ar...</td>\n",
       "      <td>GPT-Neo is a large-scale autoregressive langua...</td>\n",
       "      <td>simple</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>What action did the assistant take after findi...</td>\n",
       "      <td>[ can you bring me some chips.\\n\\nExplanation:...</td>\n",
       "      <td>nan</td>\n",
       "      <td>simple</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>What is the bootstrapping version of Auto-CoT ...</td>\n",
       "      <td>[\\n8\\n\\n9 10\\n\\nFigure 6: Effect of wrong demo...</td>\n",
       "      <td>The bootstrapping version of Auto-CoT is calle...</td>\n",
       "      <td>simple</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>What is the purpose or function of Few-Shot-CoT?</td>\n",
       "      <td>[ candy last her? A: Megan received 11 pieces ...</td>\n",
       "      <td>nan</td>\n",
       "      <td>simple</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>What is the focus of the paper \"Zero-shot text...</td>\n",
       "      <td>[, China. Association for Computational Lingui...</td>\n",
       "      <td>The focus of the paper \"Zero-shot text classif...</td>\n",
       "      <td>simple</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>How can diversity-based sampling in Auto-CoT m...</td>\n",
       "      <td>[ multiple similar questions inside a frequent...</td>\n",
       "      <td>The clustering-based sampling method in Auto-C...</td>\n",
       "      <td>reasoning</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>What error category did the model miss when de...</td>\n",
       "      <td>[ was missed by the model. An example of this ...</td>\n",
       "      <td>one step missing error</td>\n",
       "      <td>reasoning</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Q: If Luke made 9 dollars mowing lawns and 18 ...</td>\n",
       "      <td>[ pick up 9 trays from one table and 7 trays f...</td>\n",
       "      <td>Let’s think step by step. Luke made 9 dollars ...</td>\n",
       "      <td>multi_context</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>How can the number of trees planted by the gro...</td>\n",
       "      <td>[ION: Can you bring me something salty?\\n\\nMOD...</td>\n",
       "      <td>There are 21 trees after the grove workers pla...</td>\n",
       "      <td>multi_context</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Q: If Megan received 11 pieces of candy from n...</td>\n",
       "      <td>[ the number of trees they planted. So, they m...</td>\n",
       "      <td>Megan received a total of 16 pieces of candy a...</td>\n",
       "      <td>multi_context</td>\n",
       "      <td>[{'source': 'prompt-engineering-guide-papers/2...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                            question  \\\n",
       "0  What is GPT-Neo and its significance in the fi...   \n",
       "1  What action did the assistant take after findi...   \n",
       "2  What is the bootstrapping version of Auto-CoT ...   \n",
       "3   What is the purpose or function of Few-Shot-CoT?   \n",
       "4  What is the focus of the paper \"Zero-shot text...   \n",
       "5  How can diversity-based sampling in Auto-CoT m...   \n",
       "6  What error category did the model miss when de...   \n",
       "7  Q: If Luke made 9 dollars mowing lawns and 18 ...   \n",
       "8  How can the number of trees planted by the gro...   \n",
       "9  Q: If Megan received 11 pieces of candy from n...   \n",
       "\n",
       "                                            contexts  \\\n",
       "0  [ in robotic affordances, 2022. URL https://ar...   \n",
       "1  [ can you bring me some chips.\\n\\nExplanation:...   \n",
       "2  [\\n8\\n\\n9 10\\n\\nFigure 6: Effect of wrong demo...   \n",
       "3  [ candy last her? A: Megan received 11 pieces ...   \n",
       "4  [, China. Association for Computational Lingui...   \n",
       "5  [ multiple similar questions inside a frequent...   \n",
       "6  [ was missed by the model. An example of this ...   \n",
       "7  [ pick up 9 trays from one table and 7 trays f...   \n",
       "8  [ION: Can you bring me something salty?\\n\\nMOD...   \n",
       "9  [ the number of trees they planted. So, they m...   \n",
       "\n",
       "                                        ground_truth evolution_type  \\\n",
       "0  GPT-Neo is a large-scale autoregressive langua...         simple   \n",
       "1                                                nan         simple   \n",
       "2  The bootstrapping version of Auto-CoT is calle...         simple   \n",
       "3                                                nan         simple   \n",
       "4  The focus of the paper \"Zero-shot text classif...         simple   \n",
       "5  The clustering-based sampling method in Auto-C...      reasoning   \n",
       "6                             one step missing error      reasoning   \n",
       "7  Let’s think step by step. Luke made 9 dollars ...  multi_context   \n",
       "8  There are 21 trees after the grove workers pla...  multi_context   \n",
       "9  Megan received a total of 16 pieces of candy a...  multi_context   \n",
       "\n",
       "                                            metadata  episode_done  \n",
       "0  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "1  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "2  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "3  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "4  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "5  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "6  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "7  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "8  [{'source': 'prompt-engineering-guide-papers/2...          True  \n",
       "9  [{'source': 'prompt-engineering-guide-papers/2...          True  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "testset.to_pandas()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "ragas",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
